PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau



(51) International Patent Classification 5 
G06F 7/48 



A1



(11) International Publication Number: WO 96/08761 

(43) International Publication Date: 21 March 1996 (21.03.96) 



(21) International Application Number: PCT/US95/11589

(22) International Filing Date: 13 September 1995 (13.09.95) 



(30) Priority Data:
08/307,932    16 September 1994 (16.09.94)    US



(71) Applicant: THE RESEARCH FOUNDATION OF STATE 
UNIVERSITY OF NEW YORK [US/US]; Suite 200 UB 
Commons, State University Plaza, Amherst, NY 14228- 
2567 (US). 

(72) Inventors: SRIDHAR, Ramalingam; 199 Brockmoorc Drive, 
East Amherst, NY 14051 (US). ZHANG, Xuguang; 18946 
Vickie Avenue, Cerritos, CA 90701 (US). 

(74) Agent: VIKSNINS, Ann, S.; Schwegman, Lundberg & Woess- 
ner, P.O. Box 2938, Minneapolis, MN 55402 (US). 



(81) Designated States: CA, JP, European patent (AT, BE, CH, DE,
DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE).



Published 

With international search report. 



(54) Title: COMPLEMENTARY FIELD-EFFECT TRANSISTOR LOGIC CIRCUITS FOR WAVE PIPELINING 

[Front-page figure: circuit schematic; the only legible label is "TRANSMISSION GATE"]

(57) Abstract 

A family of CFET logic circuits (100, 200, 202 and 360) useful for wave-pipeline systems is described, and a method to design same.
The invention uses complementary transmission gates (221, 222; [remaining reference numerals illegible]) and pull-up or pull-down
transistors to achieve a family of CFET logic circuits which include AND, NAND, OR, NOR, XOR, XNOR, select, select-invert, invert,
and not-invert functions. Each circuit (100, 200, 202 and 360) is tuned to provide substantially equal delays, high-quality ones and zeros, and
substantially equal rise and fall times, for every combination of input-state transition and output-state transition.






FOR THE PURPOSES OF INFORMATION ONLY

Codes used to identify States party to the PCT on the front pages of
pamphlets publishing international applications under the PCT.

AT  Austria
AU  Australia
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CS  Czechoslovakia
CZ  Czech Republic
DE  Germany
DK  Denmark
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakhstan
LI  Liechtenstein
LK  Sri Lanka
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SI  Slovenia
SK  Slovakia
SN  Senegal
TD  Chad
TG  Togo
TJ  Tajikistan
TT  Trinidad and Tobago
UA  Ukraine
US  United States of America
UZ  Uzbekistan
VN  Viet Nam

WO 96/08761 PCT/US95/11589



COMPLEMENTARY FIELD-EFFECT TRANSISTOR LOGIC CIRCUITS FOR WAVE PIPELINING

Field of the Invention
The present invention relates to digital logic circuits and more
specifically to Complementary Field-Effect Transistor logic circuits suitable
for wave pipelining.

Background of the Invention
Conventional Complementary Field-Effect Transistor ("CFET") logic
circuits include N-channel field-effect transistors ("NFET") and P-channel
field-effect transistors ("PFET"). In the following description the terms
CFET, NFET, and PFET should be interpreted to include all field-effect
transistor integrated circuit technologies. Metal-Oxide Semiconductor
("MOS") processes are often used to fabricate Field-Effect Transistor ("FET")
logic circuits. As used in this description, the terms MOS and FET are
interchangeable.

Conventional logic-circuit-design techniques contemplate increasing
the throughput of a system with a "pipeline". The pipeline comprises a
number of logic sections, each separated by a register section. Each system
clock transition allows a "data signal" (herein also simply called "signal") to
propagate from one register section, through the following logic section, and
to the inputs of the following register section. Typically, new signal inputs
are not fed into a logic section until the previous signal outputs are latched
into the register section following that logic section. The maximum clock
frequency for a logic section (i.e., the frequency with which new data can be
switched into a logic section) is limited by the maximum propagation delay of
a path through that logic section.
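
As a rough numerical illustration of this limit, the following Python sketch computes the highest usable clock frequency of a conventional pipeline from its slowest logic section. The delay values and the `register_overhead_ns` parameter are hypothetical; the description gives no figures.

```python
def max_clock_frequency_mhz(section_delays_ns, register_overhead_ns=0.0):
    """Upper bound on clock frequency (MHz) for a conventional pipeline.

    The slowest logic section, plus an assumed per-register overhead,
    sets the minimum clock period.
    """
    period_ns = max(section_delays_ns) + register_overhead_ns
    return 1000.0 / period_ns


# One 20 ns logic section versus the same logic split into four 5 ns sections.
one_section = max_clock_frequency_mhz([20.0], register_overhead_ns=1.0)
four_sections = max_clock_frequency_mhz([5.0, 5.0, 5.0, 5.0], register_overhead_ns=1.0)
```

Under these assumed numbers the split design clocks faster, but only because every added register level shortens the worst-case path between registers.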

One way of increasing system throughput is to break up logic sections
into smaller sections (each with a shorter propagation delay) and insert
pipeline register-section levels to separate the smaller logic sections. The
clock speed can then be increased to take advantage of the shorter
logic-section delays.

This "pipelining" technique has been used to obtain significant speed-up
of a computer system. Figure 1a illustrates conventional pipelining,
showing the edges of signals propagating through small combinational-logic
blocks. Conventionally, a combinational-logical-function unit is partitioned
into several smaller combinational-logic blocks, and register stages are
inserted between adjacent combinational-logic blocks as the synchronizers.
However, the inserted register stages contribute to increased physical area and
added clock-distribution requirements, resulting in a limitation on
performance.

The increasing demand for high-speed, compact devices and systems,
and the limitations of existing design methods, have prompted researchers to
look for alternate techniques that can lead to high-performance digital
systems. One such method is called "wave pipelining". Wave pipelining
eliminates intermediate register stages in a pipeline system by using the
internal capacitance of a combinational block for storage. Wave-pipelined
systems do, however, have strict requirements on (a) the uniformity of path
delays, (b) uniformity of output-signal rise and fall times, and (c) the
independence of delay from the pattern of input signal transitions.

Figure 1b shows one embodiment of a wave-pipelining
technique. In Figure 1b, the internal capacitances in the combinational logic
act in effect as temporary storage elements. These dynamic storage elements
take the place of static registers used in the conventional pipelining method
shown in Figure 1a. Under the approach shown in Figure 1b, new data values
are latched in before the previous data values propagate to the next set of
registers. In this way, there are multiple coherent data "waves" within the
combinational-logic block. Hence, the system clock is much faster than the
propagation delay of the combinational-logic block between adjacent
system-clocked-register stages.
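
The relationship between clock period and the number of coexisting waves can be sketched as follows. This is an idealized illustration, assuming a single fixed block delay; the function name and the numbers are hypothetical.

```python
import math


def waves_in_flight(block_delay_ns, clock_period_ns):
    """Approximate number of data waves inside the logic block at once.

    A new wave is launched every clock period, and each wave needs
    block_delay_ns to reach the output registers.
    """
    return math.ceil(block_delay_ns / clock_period_ns)


conventional = waves_in_flight(20.0, 20.0)    # clock period equals block delay
wave_pipelined = waves_in_flight(20.0, 5.0)   # clock four times faster
```

With the clock period equal to the block delay there is one wave, which is the conventional case; shortening the period lets several waves share the same block.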

The concept of wave pipelining (also called "maximum-rate
pipelining") was first described by Cotten [Cotten:69] and Anderson, et al.
[Anderson:67], and was applied in the design of the IBM 360/91 floating-point
execution unit in the 1960s. The significant advantages of wave pipelining
are:

(1) Achieving very high pipeline rates that approach the physical
speed limit of the technology;

(2) Increasing pipeline rate without significant latency increase;

(3) Minimizing clock loading and reducing clock-distribution
problems; and

(4) Using fewer registers and reducing the area overhead otherwise
required by conventional pipelining.

To obtain a high operating speed, each path through a given functional
block must have similar path delays. This requires symmetric rise and fall
times (collectively called "transition" times) of output signals, and, for each
component within the logical-functional block, delays that are independent of
the input-signal transition patterns. Wave-pipelined systems are susceptible to
process and environmental variations which will cause
propagation-delay-variation problems [Klass:93b].

Recently, driven by the demanding speed and throughput requirements of
various digital-system applications, wave pipelining has received
considerable attention from many research groups [Wong:93] [Fan:92]
[Klass:92] [Zhang:93]. In addition, Ekroot [Ekroot:87] developed a theory of
wave pipelining and a linear program to insert delay elements to balance the
circuit under the assumption of fixed gate and module delays.

Wong et al. [Wong:93] [Wong:91] continued their initial research and
developed the algorithms to automatically equalize delays in bipolar
combinational logic circuits to achieve a high degree of wave pipelining.
These authors have also reported the results of a 63-bit population counter
using CML (Common-Mode Logic) bipolar technology, and discussed the
limitations of using standard CMOS technology for wave pipelining.

Fan et al. [Fan:92], and Klass and Mulder [Klass:92] studied the use
and limitations of CMOS technology for wave pipelining. They designed
wave-pipelined CLA (Carry Look-Ahead) adders and showed performance
improvement over conventional methods.




Lam et al. [Lam:92] analyzed valid clocking in wave-pipelined circuits 
using Timed Boolean Functions. 

Joy and Ciesielski [Joy:91] have proposed certain physical placement of
components and specific routing algorithms for laying out wave-pipelined
circuits. Klass, Flynn and Goor reported the design of a fast CMOS
wave-pipelined multiplier [Klass:93b] [Klass:93a].

The timing constraints of wave-pipelined circuits have been carefully
studied and discussed by several research groups. In summary, for a
wave-pipelined system using edge-triggered registers, the minimum clock-period
relation should be [Cotten:69] [Klass:92] [Wong:91]:

$$t_{cp} > \mathrm{Max}\{(\Delta t_p + (2 \cdot \Delta C) + t_s + t_h + t_{rf}),\ (\Delta t_x + \Delta C + t_{ms} + t_{rf})\} \qquad \text{\{Equation 1\}}$$

where the variables are defined as
$t_{cp}$ is the valid clock period,
$\Delta t_p$ is the maximum time difference between the longest and shortest
paths for the worst-case design,
$\Delta C$ is the worst-case clock skew,
$t_s$ is the setup time for registers,
$t_h$ is the hold time for registers,
$t_{rf}$ is the worst-case rise/fall time at the last logic stage,
$\Delta t_x$ is the maximum time difference between the longest and shortest path
from the global inputs to an internal signal node X, and
$t_{ms}$ is the minimum stable time for X to ensure the correct operation of the
next logic stage.

Both transition times and signal-propagation delays must be
constrained to avoid data wave interference. The clock period time limit to
prevent interference of a data wave with any previous data wave at the ending
storage element of a wave-pipelined logic section is bounded by
$t_{cp} > (\Delta t_p + (2 \cdot \Delta C) + t_s + t_h + t_{rf})$. The clock period time limit to prevent interference of
a data wave with any previous data wave inside a section of combinational
logic is bounded by $t_{cp} > (\Delta t_x + \Delta C + t_{ms} + t_{rf})$.
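
The two bounds can be evaluated numerically as in the Python sketch below. It assumes the bounds read as Δt_p + 2·ΔC + t_s + t_h + t_rf at the ending registers and Δt_x + ΔC + t_ms + t_rf at an internal node; the symbol names t_rf and t_ms and all numeric values are assumptions made for illustration.

```python
def min_clock_period(dt_p, dt_x, d_c, t_s, t_h, t_rf, t_ms):
    """Lower bound on the clock period of a wave-pipelined section.

    All arguments are times in the same unit (for example nanoseconds).
    """
    register_bound = dt_p + 2 * d_c + t_s + t_h + t_rf   # at the ending registers
    internal_bound = dt_x + d_c + t_ms + t_rf            # at internal node X
    return max(register_bound, internal_bound)


# Hypothetical values in nanoseconds.
bound = min_clock_period(dt_p=0.5, dt_x=0.3, d_c=0.1,
                         t_s=0.2, t_h=0.1, t_rf=0.4, t_ms=0.3)
```

For these values the register bound (1.4 ns) dominates the internal-node bound (1.1 ns), so reducing the path-delay spread at the registers would be the more effective change.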

To achieve maximum wave-pipeline rate, designers should minimize
$t_{cp}$ in Equation 1. Here, it is assumed that the clock skew $\Delta C$ can be minimized
by conventional design techniques, and that the terms $t_s$, $t_h$, $t_{rf}$, and $t_{ms}$ are
technology-dependent parameters and specific to a certain logic stage, so they
can be optimized individually. The remaining terms, $\Delta t_p$ and $\Delta t_x$, arise from
the following possible sources:

(1) path differences due to practical circuit configurations,

(2) data-dependent signal-delay variations, and

(3) process- and temperature-induced variations.

As some process- and temperature-induced variations are unavoidable, the
focus should be on the path differences that are due to practical circuit
configurations and data-dependent delay variations. Therefore, if possible, a
wave-pipelined circuit should be designed to have balanced paths (in terms of
the basic logic gates and delay elements) in order to keep $\Delta t_p$ and $\Delta t_x$ as
close to zero as possible.

Unfortunately, most practical digital circuits do not have such balanced
configurations. Therefore, specific algorithms have been suggested for
designing practical wave-pipelined circuits by inserting delay elements ("rough
tuning") and adjusting gate-driving abilities ("fine tuning") [Wong:93]
[Wong:89].
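
The idea behind rough tuning can be illustrated with a very small Python sketch. It is not the algorithm of [Wong:93] or [Wong:89]; it only shows, under the simplifying assumption of fixed and independent path delays, how much delay each path would need to match the slowest one. The path names and delays are hypothetical.

```python
def rough_tune(path_delays_ns):
    """Return the extra delay each path needs to match the slowest path."""
    target = max(path_delays_ns.values())
    return {name: target - delay for name, delay in path_delays_ns.items()}


padding = rough_tune({"carry": 5.0, "sum": 3.0, "bypass": 1.5})
```

After padding, every path has the same nominal delay, which drives the path-difference terms toward zero; fine tuning would then correct the residual differences caused by unequal loads.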

Even for a balanced circuit, the data-dependent delay variations of
logic gates can still contribute to the values of $\Delta t_p$ and $\Delta t_x$. This fact
establishes that, from the viewpoint of circuit designers, the minimum clock
period is eventually bounded by the delay variations of the basic logic circuit
used in a wave-pipelined system. Therefore, the choice of the circuit family
for the wave-pipelined system design can have a significant impact on
performance through the effect of delay variations at the gate level. A set of
ideal properties of the basic circuits for wave pipelining can be summarized as
follows:

(1) same gate delay for both rising and falling edges of output signal,

(2) no variation in the gate delay due to different input patterns, and

(3) no variation in the gate delay due to different previous input
patterns.

By examining these requirements, it was found that bipolar circuit families
(Emitter-Coupled Logic ("ECL"), super-buffered ECL, and Common-Mode
Logic ("CML")) are good candidates for wave pipelining [Wong:93].
Standard CMOS was not well suited for this technique, since CMOS gate
delay depends strongly on the input patterns or different signal-timing patterns
[Klass:92] [Fan:92]. For example, the standard prior-art two-input CMOS
NAND gate 10 shown in Figure 1c has two transistors in parallel (21 and 22)
and two transistors in series (23 and 24). The physical characteristics of
transistors 23 and 24 can be designed so that together they pull output 31
down to a logic "zero" at a rate corresponding to the rate that transistors 21
and 22 together can pull output 31 up to a logic "one". In such an
embodiment, if input signals 11 and 12 both start at "one", and both switch to
"zero", transistor 21 and transistor 22 will both switch, driving output 31
from ground potential 14 to $V_{DD}$ voltage 15. If, however, only a single input
switches to "zero" (e.g., input 11), only a single transistor (e.g., transistor 21)
will pull output 31 to $V_{DD}$ voltage 15. Since there is some capacitance
associated with output 31, when both transistors 21 and 22 are pulling output
31, output 31 will switch faster than if either transistor 21 or 22 alone is
driving output 31. Therefore, in CFET NAND gates, rise times vary as a
function of the input state transitions.
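
The effect can be quantified with a first-order RC model, which is an assumption made here for illustration and not part of the description: each conducting pull-up transistor is treated as a fixed on-resistance charging the output capacitance. The resistance and capacitance values are hypothetical.

```python
import math


def rise_time_10_90(r_on_ohms, c_load_farads, n_parallel=1):
    """10%-90% rise time of an RC node driven by n identical parallel pull-ups."""
    r_effective = r_on_ohms / n_parallel
    return math.log(9) * r_effective * c_load_farads


# Hypothetical device values: 10 kilohm on-resistance, 50 fF output load.
one_input_switches = rise_time_10_90(10e3, 50e-15, n_parallel=1)
both_inputs_switch = rise_time_10_90(10e3, 50e-15, n_parallel=2)
```

In this model the output rises twice as fast when both inputs fall together as when only one does, which is the input-pattern dependence that wave pipelining cannot tolerate.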

Since CMOS technology is a dominant and mature technology in the
modern semiconductor industry, and has certain unique positive features for
digital system design, it is necessary to attack the practical problems of
unequal delays and asymmetric rise and fall times and to explore novel design
techniques that are suitable for CMOS wave pipelining. Researchers have
studied the basic logic-circuit issues of the CMOS wave-pipelining technique and
have proposed some solutions. For instance, in [Fan:92] and [Gray:91], the
basic logic circuits used are an inverter (not shown) and a two-input
cross-coupled pseudo-NMOS NAND gate 40 (shown in Figure 1d), which is formed
by stacking cross-coupled n-channel transistors under a p-channel active
pull-up device with bias voltage Vb. Since, however, the bias voltage Vb has to
be distributed all over the wave-pipelined circuit chip, and the gate delay is
sensitive to the bias-voltage value, careful routing is needed to ensure proper
functioning of the circuit [Fan:92].

In an alternative approach, a balanced CMOS NAND gate (Figure 1e)
is proposed in [Klass:92] to reduce the static CMOS gate-delay variations by
adding a redundant ground-biased PMOS device to "soften" the
input-pattern-dependent delay variation. This approach, however, has the drawbacks of
increased layout area, loading capacitance, gate delays and dynamic power
dissipation.

Klass [Klass:93a] describes a wave-pipelining circuit using standard
CMOS logic gates. In [Klass:93b] and [Klass:93a], a conventional static
CMOS NAND gate and an inverter were used as the basic circuits; however,
the design was restricted to use 2-input NAND gates and inverters for every
logic function, to minimize the delay sensitivity of the circuit to the input data
patterns. In addition, every function block had to be verified separately to
avoid large delay variations.

Each of the above approaches uses only 2-input NAND gates and
inverters as the basic circuits to implement arbitrary logic functions. This
constraint can lead to a large chip area, and will limit the applications of wave
pipelining.

Wong [Wong:93] presents an algorithm for designing a
wave-pipelining circuit with minimal area and minimal power consumption. The
algorithm involves: (1) rough tuning, by adding delay elements to balance
circuit paths; and (2) fine tuning, by adjusting gate drives to compensate for
delay variations introduced by different "fanouts" (the number of loads; in
CFET technology this is primarily the sum of the capacitive load of each gate
driven by the output driver, plus the capacitance of inter-circuit wiring).

Other FET logic families have also been explored. For instance,
Complementary Pass-transistor Logic ("CPL") has proven to be a high-speed,
area-efficient, and low-power technique [Yano:90] [Weste:93]
[Shimohigashi:93]. Figure 1f shows an example of a basic prior-art CPL
logic circuit 60 [Yano:90]. In the embodiment shown in Figure 1f, the same
circuit is used to implement AND, NAND, OR, and NOR functions; the
function is determined by selection of the signals provided at the circuit
inputs. The design method presented by Yano et al. [Yano:90] had no
p-channel transistor in the pass network. Dual input signals and n-channel
pass-transistors were used to implement dual-output gate circuits.

The circuit shown in Figure 1f does have drawbacks. Circuit 60 does
not make efficient transitions with respect to logic-high input signals because
of the poor "one" conduction problem of the NMOS pass-transistors (the
maximum voltage for logic "one" is bounded by $V_{DD} - V_T$). So Yano et al.
[Yano:90] utilized a specific fabrication technology, in which NMOS
pass-transistors 62 were designed to have a zero threshold voltage $V_T = 0$ volts,
whereas the other NMOS and PMOS transistors had a $V_T = \pm 0.4$ volts,
respectively. With this design method, the quality of the logic-high is indeed
improved, but noise immunity and reliability are reduced. In addition, the
special fabrication requirements limit its wide application.
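
The threshold-drop bound can be illustrated with a short Python sketch. It uses the first-order relation that an NMOS pass transistor passes a logic "one" no higher than the supply minus its threshold voltage, ignoring the body effect, while a complementary transmission gate (NFET in parallel with a PFET) passes the full supply level. The 3.3 V supply is a hypothetical value; the 0.4 V and 0 V thresholds are the ones quoted above.

```python
def passed_high_level(v_dd, v_tn, with_pfet=False):
    """Highest voltage passed for a logic "one" (first-order, no body effect)."""
    return v_dd if with_pfet else v_dd - v_tn


nmos_only = passed_high_level(3.3, 0.4)                        # degraded "one"
zero_vt_nmos = passed_high_level(3.3, 0.0)                     # special zero-threshold device
transmission_gate = passed_high_level(3.3, 0.4, with_pfet=True)
```

The zero-threshold device and the complementary transmission gate both restore the full logic-high level, but only the latter does so without a special fabrication step.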

None of the above methods appears to teach how to design a family of
field-effect-transistor-based circuits which provide substantially equal delays
regardless of the pattern of the input logic-state transitions, and which provide
a high-quality logic one as well as a high-quality logic zero.

Summary of the Invention

The present invention is a family of CFET logic circuits useful for
wave-pipeline systems, and a method to design same. The invention uses
complementary transmission gates and pull-up or pull-down transistors to
achieve a family of CFET logic circuits which include AND, NAND, OR,
NOR, XOR, XNOR, select, select-invert, invert, and not-invert functions.
Each circuit is tuned to provide substantially equal delays, high-quality logic
ones and zeros, and substantially equal rise and fall times for every
combination of input-state transition and output-state transition.

According to one aspect of the present invention, a circuit is described
which can be used for AND, NAND, OR, or NOR functions, depending on
the input connections. This circuit includes a first pass transistor having a
first terminal coupled to a first input signal, a second terminal coupled to an
internal node, and a gate coupled to a second input signal; a second pass
transistor having a first terminal coupled to the first input signal, a second
terminal coupled to the internal node, and a gate coupled to a logical
complement of the second input signal; a third transistor having a first
terminal coupled to a voltage source, a second terminal coupled to the internal
node, and a gate coupled to the second input signal; and a driver coupled to
the internal node, the driver comprising means for amplifying a voltage,
adjusting logic levels, and providing an output signal. If the voltage source
coupled to the third transistor is a "one" level, the circuit can be used as an
AND or NOR gate. If the voltage source coupled to the third transistor is a
"zero" level, the circuit can be used as a NAND or OR gate. In one
embodiment, the first transistor has a first channel type, the second transistor
has a second channel type, and the third transistor also has the second channel
type. In one such embodiment, the first channel type is N-channel, and the
second channel type is P-channel. In another embodiment, the first channel
type is P-channel, and the second channel type is N-channel.
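
The logical behaviour of this circuit can be modelled with the Python sketch below. It rests on two assumptions that are not stated in this passage: the driver is inverting, and in the "zero" variant the transmission gate conducts when the second input is low (the P-channel-first embodiment). The function names are hypothetical.

```python
def pull_up_gate(first_in, second_in):
    """Variant with the third transistor tied to a logic "one"."""
    # Transmission gate conducts when second_in is 1, else the pull-up does.
    node = first_in if second_in else 1
    return 1 - node          # assumed inverting driver


def pull_down_gate(first_in, second_in):
    """Variant with the third transistor tied to a logic "zero"."""
    # Pull-down conducts when second_in is 1, else the transmission gate does.
    node = 0 if second_in else first_in
    return 1 - node          # assumed inverting driver


# The function realized depends only on which (dual-rail) signals are connected.
def AND(a, b):  return pull_up_gate(1 - a, b)
def NOR(a, b):  return pull_up_gate(a, 1 - b)
def NAND(a, b): return pull_down_gate(a, 1 - b)
def OR(a, b):   return pull_down_gate(1 - a, b)
```

In this model the "one" variant has a single high output state and the "zero" variant has three, consistent with an AND/NOR gate and a NAND/OR gate respectively.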

According to another aspect of the present invention, a circuit is
described which can be used for XOR, XNOR, select, or inverse-select
functions. This circuit includes a first pass transistor having a first terminal
coupled to a first input signal, a second terminal coupled to an internal node,
and a gate coupled to a second input signal; a second pass transistor having a
first terminal coupled to the first input signal, a second terminal coupled to
the internal node, and a gate coupled to a logical complement of the second
input signal; a third pass transistor having a first terminal coupled to a logical
complement of the first input signal, a second terminal coupled to the internal
node, and a gate coupled to a logical complement of the second input signal;
a fourth pass transistor having a first terminal coupled to a logical
complement of the first input signal, a second terminal coupled to the internal
node, and a gate coupled to the second input signal; and a driver coupled to
the internal node, comprising means for amplifying a voltage, adjusting logic
levels, and providing an output signal. In one embodiment, the first and third
transistors have a first channel type, and the second and fourth transistors have
a second channel type. In one such embodiment, the first channel type is
N-channel, and the second channel type is P-channel. In another embodiment,
the first channel type is P-channel, and the second channel type is N-channel.
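
A logic-level model of this two-transmission-gate circuit is sketched below in Python. It assumes an inverting driver and, for the select functions, that the terminal described above as the complement of the first input is instead driven by a second data signal; both are assumptions made for illustration, and the function names are hypothetical.

```python
def two_gate_circuit(in_when_sel_1, in_when_sel_0, sel):
    """Two transmission gates sharing one internal node, then an inverting driver."""
    # Exactly one transmission gate conducts for each value of sel.
    node = in_when_sel_1 if sel else in_when_sel_0
    return 1 - node


def XOR(a, b):  return two_gate_circuit(a, 1 - a, b)
def XNOR(a, b): return two_gate_circuit(1 - a, a, b)

def INVERSE_SELECT(d1, d0, sel): return two_gate_circuit(d1, d0, sel)
def SELECT(d1, d0, sel):         return two_gate_circuit(1 - d1, 1 - d0, sel)
```

Because the internal node is always driven through exactly one transmission gate, every input transition sees the same kind of conducting path, which is what makes equal delays achievable by tuning.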

According to yet another aspect of the present invention, a circuit is
described which can be used for generating dual-rail signals from single-rail
signals, or for inverting or non-inverting delay buffers. This circuit includes a
first pass transistor having a first terminal coupled to a first input signal, a
second terminal coupled to an internal node, and a gate coupled to a first
voltage source; a second pass transistor having a first terminal coupled to the
first input signal, a second terminal coupled to the internal node, and a gate
coupled to a second voltage source; and a driver coupled to the internal node,
the driver comprising means for amplifying a voltage, adjusting logic levels,
and providing an output signal.

According to yet another aspect of the present invention, a method is
described for designing a CFET logic circuit having a uniform overall gate
delay. The method comprises the steps: forming a Karnaugh map of the
desired function; assigning each cell in the Karnaugh map to a pair of
adjacent cells; implementing a transmission gate for each pair of adjacent
Karnaugh-map cells having one high value and one low value; implementing a
pull-up transistor for each pair of adjacent Karnaugh-map cells having two
high values; implementing a pull-down transistor for each pair of adjacent
Karnaugh-map cells having two low values; and adjusting the sizes and/or
speeds of the pull-up transistor, the pull-down transistor, and the transmission
gate to make the overall gate delay and the transition times of the output
signal substantially independent of input transition pattern.
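
The element-selection steps of this method can be sketched in Python for a two-input function. The sketch makes several assumptions: cells are paired along input A for each value of input B, the map describes the internal-node value directly (the effect of any inverting driver is ignored), and the sizing step is omitted. All names are hypothetical.

```python
def plan_elements(node_truth):
    """Choose a circuit element for each pair of adjacent Karnaugh-map cells.

    node_truth maps (a, b) to the desired internal-node value (0 or 1).
    Each pair consists of the cells (a=0, b) and (a=1, b).
    """
    plan = {}
    for b in (0, 1):
        low_a, high_a = node_truth[(0, b)], node_truth[(1, b)]
        if low_a == high_a:
            # Two equal values: a constant, so a pull transistor suffices.
            plan[b] = "pull-up" if low_a else "pull-down"
        elif high_a:
            # Node follows A within this pair.
            plan[b] = "transmission gate passing A"
        else:
            # Node follows the complement of A within this pair.
            plan[b] = "transmission gate passing NOT A"
    return plan


and_node = {(a, b): a & b for a in (0, 1) for b in (0, 1)}
xor_node = {(a, b): a ^ b for a in (0, 1) for b in (0, 1)}
and_plan = plan_elements(and_node)
xor_plan = plan_elements(xor_node)
```

For the AND-shaped map the result is one pull-down and one transmission gate; for the XOR-shaped map it is two transmission gates, matching the two circuit topologies described in the preceding aspects.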

According to yet another aspect of the present invention, a method is
described for designing a CFET logic gate-pair circuit having a delay
substantially independent of input transition pattern, where the circuit has a
first pull transistor, a second pull transistor having a channel type
complementary to the channel type of the first pull transistor, and a first
transmission gate comprising an NFET and a PFET. The method comprises
the steps: providing a size for the first pull transistor; determining a size for
the second pull transistor in order to ensure substantially equal rise and fall
times of the first and second pull transistors; determining a ratio of NFET size
to PFET size of the first transmission gate to ensure substantially equal rise
and fall times of the transmission gate; and determining a ratio of the first
transmission gate size to the first pull transistor size to ensure substantially
equal transition times and substantially equal gate propagation delays.

According to yet another aspect of the present invention, a
complementary field-effect transistor logic circuit is described comprising a
first pass transistor having a first channel type and having a first terminal
coupled to a first input signal, a second terminal coupled to an internal node,
and a gate coupled to a second input signal; a second pass transistor having a
second channel type which is complementary to the first channel type and
having a first terminal coupled to the first input signal, a second terminal
coupled to the internal node, and a gate coupled to a logical complement of
the second input signal; a third transistor having the second channel type and
having a first terminal coupled to a voltage terminal, a second terminal
coupled to the internal node, and a gate coupled to the second input signal;
and a driver coupled to the internal node, comprising means for amplifying a
voltage and adjusting logic levels at an output signal.

According to yet another aspect of the present invention, a
complementary field-effect transistor logic circuit is described comprising a
first transistor for coupling a first input signal to an output signal in response
to a second input signal; a second transistor for coupling the first input signal
to the output signal in response to a logical complement of the second input
signal; and a third transistor for coupling a logical-high signal to the output
signal in response to the second input signal; wherein parameters of the first,
second, and third transistors are chosen such that propagation delays for any
combination of logical value transitions are substantially equal.

According to yet another aspect of the present invention, a
complementary field-effect transistor parallel-adder logic circuit is described
comprising a plurality of pg generator circuits wherein each pg generator
circuit comprises an AND/NAND gate circuit and a XOR/XNOR gate circuit;
a plurality of black processor circuits wherein at least two of the black
processor circuits are coupled to outputs of the pg generator circuits and
wherein each black processor circuit comprises a MUX/inverse-MUX gate
circuit and an AND/NAND gate circuit; and a plurality of exclusive-OR circuits
coupled to at least two outputs of the black processor circuits.
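
A logic-level Python sketch of such an adder follows. It assumes that "pg" denotes the usual propagate/generate pair (propagate from the XOR gate, generate from the AND gate), that the black processor is the standard prefix operator with its generate output formed by a multiplexer selected by the propagate signal, and that the processors are chained serially for brevity rather than arranged in any particular tree. None of these details is stated in this passage.

```python
def pg_generator(a, b):
    """Return (generate, propagate) for one bit position: AND gate and XOR gate."""
    return a & b, a ^ b


def black_processor(high, low):
    """Combine the (g, p) pair of a higher group with that of a lower group."""
    g_high, p_high = high
    g_low, p_low = low
    # MUX: with XOR-based propagate, p_high = 1 implies g_high = 0,
    # so selecting g_low is equivalent to g_high OR (p_high AND g_low).
    g = g_low if p_high else g_high
    p = p_high & p_low                 # AND gate
    return g, p


def add(a_bits, b_bits):
    """Add two equal-length LSB-first bit lists; the result has one extra bit."""
    sums, group = [], None
    for bit_pg in (pg_generator(a, b) for a, b in zip(a_bits, b_bits)):
        carry_in = group[0] if group else 0
        sums.append(bit_pg[1] ^ carry_in)          # final exclusive-OR stage
        group = bit_pg if group is None else black_processor(bit_pg, group)
    return sums + [group[0] if group else 0]


result = add([1, 1, 0, 1], [1, 0, 1, 1])   # 11 + 13, LSB first
```

The multiplexer form of the generate output is what allows the black processor to be built from the same MUX and AND gate circuits as the rest of the family.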

According to yet another aspect of the present invention, a
complementary field-effect transistor 4:2 compressor logic circuit is described
comprising a first OR/NOR gate coupled to a first input signal and a second
input signal and producing a first internal OR/NOR signal; a second OR/NOR
gate coupled to a third input signal and a fourth input signal and producing a
second internal OR/NOR signal; a first AND/NAND gate coupled to the first
internal OR/NOR signal and the second internal OR/NOR signal and producing
a carry-out signal; a second AND/NAND gate coupled to the first input signal
and the second input signal and producing a first internal AND/NAND signal;
a third AND/NAND gate coupled to the third input signal and the fourth input
signal and producing a second internal AND/NAND signal; a third OR/NOR
gate coupled to the first internal AND/NAND signal and the second internal
AND/NAND signal and producing a third internal OR/NOR signal; a first
XOR/XNOR gate coupled to the first input signal and the second input signal
and producing a first internal XOR/XNOR signal; a second XOR/XNOR gate
coupled to the third input signal and the fourth input signal and producing a
second internal XOR/XNOR signal; a third XOR/XNOR gate coupled to the
first internal XOR/XNOR signal and the second internal XOR/XNOR signal
and producing a third internal XOR/XNOR signal; a fourth XOR/XNOR gate
coupled to the third internal XOR/XNOR signal and a carry-in signal and
producing an S signal; and a MUX/inverse-MUX gate coupled to the third
internal XOR/XNOR signal and the carry-in signal and the third internal
OR/NOR signal and producing a C signal.
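
The gate network described above can be checked at the logic level with the Python sketch below. It uses only the true (non-inverted) output of each gate and assumes that the multiplexer is selected by the third internal XOR signal, passing the carry-in when that signal is high and the third internal OR signal otherwise; the passage does not state which input is the select line. Variable names are hypothetical.

```python
from itertools import product


def compressor_4_2(x1, x2, x3, x4, cin):
    """Return (S, C, carry_out) of the 4:2 compressor gate network."""
    or12, or34 = x1 | x2, x3 | x4
    carry_out = or12 & or34                 # first AND gate
    and12, and34 = x1 & x2, x3 & x4
    or3 = and12 | and34                     # third OR gate
    xor12, xor34 = x1 ^ x2, x3 ^ x4
    xor3 = xor12 ^ xor34                    # third XOR gate
    s = xor3 ^ cin                          # fourth XOR gate
    c = cin if xor3 else or3                # MUX (select line assumed to be xor3)
    return s, c, carry_out


# The outputs must encode the number of ones among the five inputs.
all_cases_ok = all(
    s + 2 * (c + carry_out) == sum(bits)
    for bits in product((0, 1), repeat=5)
    for s, c, carry_out in [compressor_4_2(*bits)]
)
```

Under this assumption the network compresses correctly for all 32 input combinations, and the carry-out does not depend on the carry-in, so a row of compressors has no ripple between them.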

Brief Description of the Drawings
Figure 1a is a schematic diagram illustrating a regular pipelining
technique;

Figure 1b is a schematic diagram illustrating a wave-pipelining
technique;

Figure 1c is a schematic diagram illustrating a prior-art CMOS NAND
gate;

Figure 1d is a schematic diagram illustrating a prior-art cross-coupled
pseudo-NMOS NAND gate;

Figure 1e is a schematic diagram illustrating a prior-art
balanced-CMOS NAND gate;

Figure 1f is a schematic diagram illustrating a prior-art Complementary
Pass-transistor Logic (CPL) AND/NAND/OR/NOR gate;

Figure 2a is a schematic flow diagram illustrating a method for
designing a gate circuit from an inverse Karnaugh map according to the
invention;

Figure 2b is a schematic diagram illustrating an embodiment of a
CFET logic circuit according to the invention having an output which has one
high-level output state;

Figure 2c is a schematic flow diagram illustrating a method for
designing another gate circuit from an inverse Karnaugh map according to the
invention;

Figure 2d is a schematic diagram illustrating an embodiment of a
CFET logic circuit according to the invention having an output which has
three high-level output states;

Figure 2e is a schematic diagram illustrating an embodiment of a
CFET logic circuit according to the invention providing an inverting buffer;

Figure 2f is a schematic diagram illustrating an embodiment of a pair
of CFET logic circuits as shown in Figures 2b and 2d connected to provide an
AND/NAND function;

Figure 2g is a schematic diagram illustrating an embodiment of a pair
of CFET logic circuits as shown in Figures 2b and 2d connected to provide an
OR/NOR function;

Figure 3a is a schematic diagram illustrating an embodiment of a
CFET logic circuit according to the invention having an output which has two
high-level output states;

Figure 3b is a schematic diagram illustrating an embodiment of a
CFET logic circuit as shown in Figure 3a connected to provide an XOR
function;

Figure 3c is a schematic diagram illustrating an embodiment of a
CFET logic circuit as shown in Figure 3a connected to provide an XNOR
function;

Figure 3d is a schematic diagram illustrating an embodiment of a
CFET logic circuit as shown in Figure 3a connected to provide a 2-input
multiplexor function;

Figure 3e is a schematic diagram illustrating an embodiment of a
CFET logic circuit as shown in Figure 3a connected to provide an inverse
2-input multiplexor function;

Figure 4a is a schematic diagram illustrating an embodiment of a
CFET logic circuit according to the invention having an inverting output state;

Figure 4b is a schematic diagram illustrating an embodiment of a
CFET logic circuit according to the invention having a non-inverting output
state;

Figure 5a is a schematic diagram showing an equivalent circuit for the
11->00 and 11->10 input state transitions of the circuit in Figure 2b;

Figure 5b is a schematic diagram showing an equivalent circuit for the
11->00 and 11->10 input state transitions of the circuit in Figure 2d;

Figure 5c is a schematic diagram showing an equivalent circuit for the
11->01 input state transition of the circuit in Figure 2b;

Figure 5d is a schematic diagram showing an equivalent circuit for the
11->01 input state transition of the circuit in Figure 2d;

Figure 5e is a schematic diagram showing an equivalent circuit for the
01->11 input state transition of the circuit in Figure 2b;

Figure 5f is a schematic diagram showing an equivalent circuit for the
01->11 input state transition of the circuit in Figure 2d;

Figure 5g is a schematic diagram showing an equivalent circuit for the
10->11 input state transition of the circuit in Figure 2b;

Figure 5h is a schematic diagram showing an equivalent circuit for the
10->11 input state transition of the circuit in Figure 2d;

Figure 5i is a schematic diagram showing an equivalent circuit for the
00->11 input state transition of the circuit in Figure 2b;

Figure 5j is a schematic diagram showing an equivalent circuit for the
00->11 input state transition of the circuit in Figure 2d;

Figure 6 is a schematic diagram showing a 16-bit carry look-ahead
adder implemented with WTGL gates; and

Figure 7 is a schematic diagram showing a 4:2 compressor circuit
implemented with WTGL gates.

Description of the Preferred Embodiment 
In the following detailed description of the preferred embodiments, 
reference is made to the accompanying drawings which form a part hereof, 
and in which are shown by way of illustration specific embodiments in which 
the invention may be practiced. It is to be understood that other embodiments
may be utilized and structural changes may be made without departing from 
the scope of the present invention. 

Improved Complementary Pass-transistor Logic (CPL) circuits can be
used as the basic cells to implement a high-performance CFET wave-pipelined
system. This family of basic cells, called "Wave-pipelined Transmission-Gate
Logic" ("WTGL"), can be designed to have substantially equal signal rise and
fall times and reduced gate-delay variations. Each circuit uses a configuration
of transmission gates, pull-up/pull-down transistors, and dual-rail input signals
to perform a basic logic function; each circuit also has an invertor-driver to
drive the next logic stage. In one embodiment, the invertor-driver is
fabricated in CMOS, as shown in Figure 2e.

A Karnaugh map can be used to design the basic WTGL cell. The
procedure for designing a two-input AND gate is shown in Figures 2a and 2b.
Note that in these embodiments, since each basic cell is buffered by an
invertor, the Karnaugh maps are shown for the logical complements of the
desired functions. For instance, in Figure 2a, Karnaugh map 201 shows the
map for the inverse of a two-input AND gate, which has three high states and
one low state. According to Karnaugh map 201, a pass network 200 with
inputs of NOT(A), B, and "one" could be used to provide the function shown in
Figure 2a, with driver-invertor circuit 232 providing the proper-polarity AND
function.

Figure 2b shows one embodiment of a pass network 200 which can be
used to implement the AND function of Figure 2a. Although the embodiment
shown in Figure 2b uses CMOS transistors, persons skilled in the art will
readily understand that any complementary field-effect transistor technology
could be used to advantage. In the embodiment shown in Figure 2b, a CMOS
transmission gate (transistors 221 and 222) and a pull-up transistor 224 have
replaced the NMOS pass-transistor of the CPL designs shown in Figure 1f.
Thus, the quality of logic "one" is guaranteed at the gates of output inverters
even using standard CMOS technology.

In this embodiment, n-channel pass transistor 221 and p-channel pass
transistor 222 form a CFET transmission gate. (In this configuration, it is not
particularly meaningful to distinguish transistor terminals as drain or source,
since the relative voltage between input 211 and node 231 may be either
positive or negative, depending on the states of inputs 211, 212, and 213. The
physical layout is generally symmetric for the source, gate, and drain
terminals. Therefore, rather than using the terms "source" and "drain", these
drain/source transistor connections will each be called "terminals".) Input 211
is coupled to one terminal of n-channel pass transistor 221, input 213 is
coupled to the gate of n-channel pass transistor 221, and the other terminal of
n-channel pass transistor 221 is connected to node 231. Input 211 is also
coupled to one terminal of p-channel pass transistor 222, input 212 is coupled
to the gate of p-channel pass transistor 222, and the other terminal of
p-channel pass transistor 222 is connected to node 231. In this embodiment the
substrates of the p-channel devices are internally connected to V_DD (voltage
215 in this embodiment), and the substrates of the n-channel devices are
internally connected to ground (voltage 216 in this embodiment).

The circuit of Figure 2b can be used to implement the AND function
of Figure 2a. To do so, input 213 is connected to logic signal B of Figure 2a,
input 211 is connected to NOT(A) of Figure 2a, and input 212 is connected to
NOT(B) of Figure 2a. The transmission gate formed by transistors 221 and 222
passes NOT(A) from input 211 to node 231 if B is high, and is cut off if B is
low. V_DD voltage 215 is coupled to one terminal of p-channel pull-up
transistor 224, input 213 is coupled to the gate of p-channel pull-up
transistor 224, and the other terminal of p-channel pull-up transistor 224 is
connected to node 231. Pull-up transistor 224 passes a "one" from V_DD
voltage 215 to node 231 if B is low, and is cut off if B is high. Thus,
Karnaugh map 201 represents the state of node 231; invertor driver 232 then
amplifies this voltage and inverts it (adjusting the logic level from negative
to positive), thus providing the proper
polarity AND function at output 233.
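
The switching behavior just described can be checked with a small switch-level model. The sketch below is illustrative only: it assumes dual-rail wiring in which input 211 carries NOT(A), input 213 carries B, and input 212 carries NOT(B), and the function names are hypothetical rather than taken from the specification.

```python
from itertools import product

def fig2b_node(in211, in212, in213):
    """Logic level on node 231 of the Figure 2b pass network.

    Raises ValueError if the node would float or be driven to two
    different levels at once.
    """
    drivers = []
    # n-channel pass transistor 221 conducts when its gate (input 213) is high;
    # p-channel pass transistor 222 conducts when its gate (input 212) is low.
    if in213 == 1 or in212 == 0:
        drivers.append(in211)
    # p-channel pull-up transistor 224 conducts when its gate (input 213) is low.
    if in213 == 0:
        drivers.append(1)
    if not drivers or len(set(drivers)) != 1:
        raise ValueError("node 231 floating or in contention")
    return drivers[0]

def wtgl_and(a, b):
    """AND at output 233, under the assumed dual-rail wiring."""
    node = fig2b_node(in211=1 - a, in212=1 - b, in213=b)
    return 1 - node  # invertor driver 232 inverts node 231

truth_table = {(a, b): wtgl_and(a, b) for a, b in product((0, 1), repeat=2)}
```

For every input pair exactly one of the two paths drives node 231, so the model never reports contention, and the resulting table is that of a two-input AND.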

Figures 2c and 2d show the corresponding procedure for designing a
two-input NAND gate. In Figure 2c, Karnaugh map 101 shows the map for
the inverse of a two-input NAND gate, which has three low states and one
high state. According to the Karnaugh map 101, a pass network 100 having
inputs of A, B, and "zero" could be used to provide the function shown in
Figure 2c.

Figure 2d shows one embodiment of a pass network 100 which can be
used to implement the NAND function of Figure 2c. In this embodiment,
n-channel pass transistor 121 and p-channel pass transistor 122 form a CFET
transmission gate. Input 111 is coupled to one terminal of n-channel pass
transistor 121, input 113 is coupled to the gate of n-channel pass transistor
121, and the other terminal of n-channel pass transistor 121 is connected to
node 131. Input 111 is also coupled to one terminal of p-channel pass
transistor 122, input 112 is coupled to the gate of p-channel pass transistor
122, and the other terminal of p-channel pass transistor 122 is connected to
node 131.

The circuit of Figure 2d can be used to implement the NAND function
of Figure 2c. To do so, input 113 is connected to B of Figure 2c, input 111
is connected to A of Figure 2c, and input 112 is connected to NOT(B) of
Figure 2c. The transmission gate formed by transistors 121 and 122 passes A
from input 111 to node 131 if B is high, and is cut off if B is low. Ground
voltage 114 is coupled to one terminal of n-channel pull-down transistor 123,
input 112 is coupled to the gate of n-channel pull-down transistor 123, and
the other terminal of n-channel pull-down transistor 123 is connected to node
131. Pull-down transistor 123 passes a "zero" from ground voltage 114 to node
131 if NOT(B) is high, and is cut off if NOT(B) is low. Thus, Karnaugh map 101
represents the state of node 131; invertor driver 132 then amplifies this
voltage and inverts it (adjusting the logic level from negative to positive),
thus providing the proper polarity NAND function at output 133.

Figure 2f is a schematic diagram illustrating an embodiment of a pair
of CFET logic circuits as shown in Figures 2b and 2d connected to provide an
AND/NAND function. Paired circuit 202 produces AB and, at the same time
and with the same delay, NOT(AB). This circuit can also be used to implement
an OR/NOR function (e.g., A to input 211, NOT(B) to input 213, and B to input
212 provides the NOR function, NOT(A+B), at output 233; NOT(A) to input 111,
NOT(B) to input 113, and B to input 112 provides the OR function A+B at
output 133). Figure 2g is a schematic diagram illustrating an embodiment of a
pair of CFET logic circuits as shown in Figures 2b and 2d connected as a paired
circuit 202 to provide an OR/NOR function.
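
As a hedged illustration of how one pair of cells yields all four functions purely through input wiring, the sketch below models each cell at the function level. The helper names are hypothetical; `x` stands for the pass input (211 or 111) and `y` for the signal on the n-channel gate (213 or 113), with NOT(y) assumed to be present on the complementary gate input (212 or 112).

```python
def cell_2b(x, y):
    """Figure 2b cell: node 231 is x when y is high, else pulled up to 1."""
    node = x if y else 1
    return 1 - node  # output 233, after the invertor driver

def cell_2d(x, y):
    """Figure 2d cell: node 131 is x when y is high, else pulled down to 0."""
    node = x if y else 0
    return 1 - node  # output 133, after the invertor driver

def and_nand(a, b):
    """Dual-rail (AND, NAND) pair, as in Figure 2f."""
    return cell_2b(1 - a, b), cell_2d(a, b)

def or_nor(a, b):
    """Dual-rail (OR, NOR) pair, as in Figure 2g: same cells, rewired inputs."""
    return cell_2d(1 - a, 1 - b), cell_2b(a, 1 - b)
```

The OR/NOR wiring is the De Morgan dual of the AND/NAND wiring, which is why no additional transistors are needed.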

Figure 3a is a schematic diagram illustrating an embodiment of a
CFET logic circuit according to the invention having an output which has two
high-level output states. This circuit is used to implement the exclusive-OR
("XOR") function or the inverse-XOR function. In addition, this circuit is
used to provide the multiplexor ("MUX") function and the inverse-MUX
function. In the embodiment shown in Figure 3a, n-channel pass transistor 321
and p-channel pass transistor 322 form one CFET transmission gate; n-channel
pass transistor 323 and p-channel pass transistor 324 form another CFET
transmission gate. Input 311 is coupled to one terminal of n-channel pass
transistor 321, input 313 is coupled to the gate of n-channel pass transistor
321, and the other terminal of n-channel pass transistor 321 is connected to
node 331. Input 311 is also coupled to one terminal of p-channel pass
transistor 322, input 314 is coupled to the gate of p-channel pass transistor
322, and the other terminal of p-channel pass transistor 322 is connected to
node 331. Input 312 is coupled to one terminal of n-channel pass transistor
323, input 315 is coupled to the gate of n-channel pass transistor 323, and the
other terminal of n-channel pass transistor 323 is connected to node 331.
Input 312 is also coupled to one terminal of p-channel pass transistor 324,
input 316 is coupled to the gate of p-channel pass transistor 324, and the other
terminal of p-channel pass transistor 324 is connected to node 331.

Figure 3b is a schematic diagram illustrating an embodiment of a
CFET logic circuit as shown in Figure 3a connected to provide an XOR
function. In this embodiment, inputs 313 and 316 are connected to B, input
311 is connected to A, inputs 314 and 315 are connected to NOT(B), and input
312 is connected to NOT(A). The transmission gate formed by transistors 321
and 322 passes A from input 311 to node 331 if B is high, and is cut off if B
is low. The transmission gate formed by transistors 323 and 324 passes NOT(A)
from input 312 to node 331 if B is low, and is cut off if B is high. Thus,
(AB + NOT(A)NOT(B)) represents the state of node 331; inverter driver 332 then
amplifies this voltage and inverts it (adjusting the logic level from negative
to positive), thus providing the proper polarity XOR function
(A NOT(B) + NOT(A) B) at output 333.

Another embodiment, shown in Figure 3c, uses a CFET logic circuit as
shown in Figure 3a connected to provide the inverse-exclusive-OR ("XNOR")
function. Inputs 313 and 316 are connected to NOT(B), input 311 is connected
to A, inputs 314 and 315 are connected to B, and input 312 is connected to
NOT(A). The transmission gate formed by transistors 321 and 322 passes A from
input 311 to node 331 if B is low, and is cut off if B is high. The
transmission gate formed by transistors 323 and 324 passes NOT(A) from input
312 to node 331 if B is high, and is cut off if B is low. Thus,
(A NOT(B) + NOT(A) B) represents the state of node 331; inverter driver 332
then amplifies this voltage and inverts it (adjusting the logic level from
negative to positive), thus providing the proper polarity XNOR function
(AB + NOT(A)NOT(B)) at output 333.

Yet another embodiment, shown in Figure 3d, uses a CFET logic
circuit as shown in Figure 3a connected to provide a 2-input multiplexor
("MUX") function. Inputs 313 and 316 are connected to C, input 311 is
connected to NOT(A), inputs 314 and 315 are connected to NOT(C), and input 312
is connected to NOT(B). Thus, NOT(AC + B NOT(C)) represents the state of node
331; invertor driver 332 then amplifies this voltage and inverts it, providing
the proper-polarity MUX function (AC + B NOT(C)) at output 333.

Yet another embodiment, shown in Figure 3e, uses a CFET logic
circuit as shown in Figure 3a connected to provide an inverse 2-input
multiplexor ("inverse-MUX") function. Inputs 313 and 316 are connected to
C, input 311 is connected to A, inputs 314 and 315 are connected to NOT(C),
and input 312 is connected to B. Thus, (AC + B NOT(C)) represents the state of
node 331; invertor driver 332 then amplifies this voltage and inverts it,
providing the proper-polarity inverse-MUX function NOT(AC + B NOT(C)) at
output 333.
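
The four wirings of the Figure 3a cell can be summarized in a short function-level sketch. It is illustrative only: `sel` is a hypothetical name for the signal on inputs 313 and 316, with its complement assumed to be present on inputs 314 and 315, and the wrapper names are not taken from the specification.

```python
def cell_3a(in311, in312, sel):
    """Figure 3a cell: node 331 follows input 311 when sel is high and
    input 312 when sel is low; output 333 is the inverted node."""
    node = in311 if sel else in312
    return 1 - node  # invertor driver 332

def xor(a, b):
    return cell_3a(a, 1 - a, b)          # Figure 3b wiring

def xnor(a, b):
    return cell_3a(a, 1 - a, 1 - b)      # Figure 3c wiring

def mux(a, b, c):
    return cell_3a(1 - a, 1 - b, c)      # Figure 3d wiring: A when C, else B

def inv_mux(a, b, c):
    return cell_3a(a, b, c)              # Figure 3e wiring
```

Because exactly one transmission gate conducts for either value of `sel`, node 331 is always driven, and each function passes through the same two-element path of one transmission gate and one invertor.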

Even when a logic function is not required, it is critical to maintain
corresponding logic delays through each wave-pipeline section. To do this,
the WTGL family of circuits includes non-inverting and inverting logic
circuits. Figure 4a is a schematic diagram illustrating an embodiment of a
CFET logic circuit according to the invention having an inverting output state.
Transmission gate 420 formed by transistors 421 and 422 always passes signal
A from input 411 to node 431. Invertor 432 inverts this signal and provides
NOT(A) at output 433 with the same delay characteristics as the other
above-described circuits of the invention. Because the transmission gate
circuit is similar to those of Figures 2b, 2d, and 3a, the delay
characteristics can be adjusted to match those of the logic gates.

Figure 4b is a schematic diagram illustrating an embodiment of a
CFET logic circuit according to the invention having a non-inverting output
state. Invertor 434 is designed to match the delay characteristics of a
transmission gate such as transmission gate 420 of Figure 4a. If input 412 is
coupled to signal A, then node 435 will represent NOT(A). Invertor 436 then
re-inverts the signal at node 435 and provides signal A at output 437 with the
same delay characteristics as the other above-described circuits of the
invention.

In order to implement a high-speed wave-pipelined system, the basic
cells must have good delay properties, and must be as insensitive to the input
signal transition patterns as possible. Therefore, the delay characteristics of
WTGL gate circuits must be critically analyzed to evaluate the feasibility of
each circuit for wave-pipelining design.

The actual delay properties of the WTGL circuits in Figures 2b and 2d
depend strongly on device sizing. This gate delay can be evaluated and
compared by, for instance, observing the charging and discharging of internal
nodes 231 and 131.

Any of numerous methods well known to persons skilled in the art can
be used to choose or adjust the parameters which affect the speeds of the
various transistors to achieve overall gate delay balance, including but not
limited to: adjusting the width-to-length ratio of the transistor gate of a
field-effect transistor, adjusting the thickness of a gate insulator, adjusting
the carrier or impurity density, choosing the semiconductor material (e.g.,
silicon or gallium-arsenide) and doping material (e.g., phosphorus or arsenic),
and changing the capacitances associated with the various terminals of the
transistor.

The circuit in Figure 2b has two alternately conducting paths to node
231: one is pull-up transistor 224, the other is the transmission gate (TG)
formed by pass transistor 221 and pass transistor 222. Similarly, the circuit in
Figure 2d has two alternately conducting paths to node 131: one is pull-down
transistor 123, the other is the transmission gate formed by pass
transistor 121 and pass transistor 122. So with careful layout design of the
basic circuit, it is possible to minimize the delay variations for all the
input-pattern combinations by balancing the sizes of pull-up and pull-down
transistors and the transmission gates. After detailed analysis of all the
input-pattern combinations, four cases of 231 (or 131) node charging and
discharging equivalent circuits were obtained, as shown in Figures 5a through
5j. The dashed invertors at the inputs are the output driver-invertors of the
previous stage. Dashed capacitors 239 and 139 are the equivalent lumped
capacitances of the internal wiring and input gates of invertor-drivers 232 and
132 respectively.

Figure 5a is a schematic diagram showing an equivalent circuit for the
11->00 and 11->10 input state transitions of the circuit in Figure 2b. Figure
5b is a schematic diagram showing an equivalent circuit for the 11->00 and
11->10 input state transitions of the circuit in Figure 2d. Figure 5c is a
schematic diagram showing an equivalent circuit for the 11->01 input state
transition of the circuit in Figure 2b. Figure 5d is a schematic diagram
showing an equivalent circuit for the 11->01 input state transition of the
circuit in Figure 2d. Figure 5e is a schematic diagram showing an equivalent
circuit for the 01->11 input state transition of the circuit in Figure 2b. Figure
5f is a schematic diagram showing an equivalent circuit for the 01->11 input
state transition of the circuit in Figure 2d. Figure 5g is a schematic diagram
showing an equivalent circuit for the 10->11 input state transition of the
circuit in Figure 2b. Figure 5h is a schematic diagram showing an equivalent
circuit for the 10->11 input state transition of the circuit in Figure 2d. Figure
5i is a schematic diagram showing an equivalent circuit for the 00->11 input
state transition of the circuit in Figure 2b. Figure 5j is a schematic diagram
showing an equivalent circuit for the 00->11 input state transition of the
circuit in Figure 2d.

The goals in all the cases shown in Figures 5a through 5j are to
balance the rise and fall times to each other for each output signal; once
balanced, these are collectively called the "transition" time for the circuit.
Then, the transition time for each circuit is balanced to equal the transition
times of all the other circuits, to the greatest extent possible. Similarly, the
propagation delay of each circuit must also be made substantially equal to the
propagation delays of all the other circuits, to the greatest extent possible.
The optimization method is given according to the actual switching behaviors
of the circuit:

(a) Pull-down NMOS sizing,

(b) Pull-up PMOS sizing,

(c) Transmission Gate rise and fall time balancing, and

(d) Overall delay balancing.

First, a reference delay is determined for a basic-size pull-down NFET
device; thus, a size is chosen for transistor 123 of Figure 5b, and a simulation
of the equivalent circuit of Figure 5b is run to determine the delay and rise
time of that circuit, which is then used as a reference delay. Then, a
simulation of Figure 5a is run and the size of the pull-up PFET device,
transistor 224, is determined to ensure the rise time for the pull-up transistor
of Figure 5a equals the fall time for the pull-down transistor in Figure 5b.
For Figures 5c, 5d, 5e, 5f, 5g, 5h, 5i, and 5j, the transmission gates are
conducting to charge or discharge the 231 and 131 nodes. The optimized
ratio of PFET to NFET size is determined in order to get substantially equal
rise and fall times for the transmission gates; the size ratio of transistor 221 to
transistor 222 will generally be the same as the size ratio of transistor 121 to
transistor 122. Then, the whole transmission-gate size is adjusted to balance
its delay with that of the appropriate pull-up or pull-down device. Since the
parasitic effects are dependent on device size and layout style, the
optimization procedure may need several iterations to achieve overall gate
delay balance.
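
The first two sizing steps can be pictured with a toy numerical loop. The sketch below is not the simulation flow described in this specification: it substitutes a first-order RC estimate for circuit simulation, and every constant, function name, and width unit in it is a hypothetical assumption chosen only to show the shape of the search.

```python
# Toy first-order RC model. Every constant below is a hypothetical value
# chosen for illustration; none comes from the specification.
MU_RATIO = 2.5      # assumed NFET/PFET drive-strength ratio at equal width
C_LOAD = 10.0       # assumed fixed capacitance on the internal node
C_PER_WIDTH = 0.5   # assumed parasitic capacitance added per unit of width

def pull_down_fall(wn):
    """Toy fall time of node 131 through a pull-down NFET of width wn."""
    return (C_LOAD + C_PER_WIDTH * wn) / wn

def pull_up_rise(wp):
    """Toy rise time of node 231 through a pull-up PFET of width wp."""
    return MU_RATIO * (C_LOAD + C_PER_WIDTH * wp) / wp

def match_width(time_fn, target, lo=0.1, hi=1000.0, tol=1e-9):
    """Bisect for the width whose transition time equals target.

    Assumes time_fn decreases monotonically as the width grows and that
    the target lies between time_fn(lo) and time_fn(hi).
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if time_fn(mid) > target:
            lo = mid   # still too slow: a wider device is needed
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

wn = 2.0                                # step (a): pick a basic NFET size
t_ref = pull_down_fall(wn)              # reference transition time
wp = match_width(pull_up_rise, t_ref)   # step (b): size the pull-up PFET
```

Because the parasitic term grows with width, the matching width cannot simply be read off from the drive-strength ratio, which is a small-scale version of why the procedure above may need several iterations. Steps (c) and (d) would repeat the same kind of search on the transmission-gate devices.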

In one embodiment, simulations are done using a simulator like
SPICE3, with the circuit netlist file extracted from physical layout (developed
using MAGIC) whenever the circuit layout changes. With careful circuit
analysis and intensive simulations of various device sizing, cells with the
required properties can be developed.

By performing these steps, the overall delay variations of the WTGL
AND/OR gate (with output loading ranging from 0 to 1 pF) are considerably
reduced compared to conventional static CFET technology. Similar balancing
techniques can be used to minimize overall delay variations for the other
circuits of the WTGL family. The result is a set of WTGL basic circuits,
each with substantially similar delay and rise/fall times and each having
dual-rail outputs, as follows:

(a) a 2-input AND / OR / NAND / NOR circuit (e.g., Figures 2f and
2g),

(b) a 2-input XOR / XNOR circuit (e.g., Figures 3b and 3c),

(c) a 2-to-1 MUX circuit (e.g., Figures 3d and 3e), and

(d) an invertor / non-invertor delay circuit for the interface between
single-rail and dual-rail circuits and for inverting or non-inverting types of
delay element used as the padding elements (which are adjustable in terms of
delay) (e.g., Figures 4a and 4b).

As noted above, the circuits for AND/NAND and OR/NOR functions
are actually the same, the only difference being the coupling of input signals.
A similar convention is applicable for XOR/XNOR and MUX/inverse-MUX
functions. For the XOR/XNOR and MUX in Figures 3b, 3c, 3d, and 3e, the
simulation of the equivalent circuits of Figures 5c through 5j includes all the
possible charging and discharging cases with different input-pattern
combinations. So the optimization procedure is simpler than that of the
AND/OR/NAND/NOR gate described above for Figures 2f and 2g. Most importantly,
all the basic logic circuits have the same delay properties. Hence, in contrast
to the mere single logic circuit used in other approaches, the present invention
provides a family of basic logic circuits which can be used to implement
wave-pipelined systems (as shown in Figures 2e, 2f, 2g, 3b, 3c, 3d, 3e, 4a, and
4b) and which can be designed to all have substantially the same timing
properties.

The dual-rail approach also has certain advantages over other
techniques for wave-pipelined design. For instance, in single-rail systems, if
the inputs of one logic level require both non-inverted and inverted terms
(which is the most common case), and if only NANDs and inverters are
available, then one has to insert both an invertor and a delay element to get
substantially equally-delayed dual signals. Also, all the other signals at the
same logic stage should be delayed by the same amount to keep the timing
balance. Such adjustments result in an increase in system delay and layout
area. In contrast, the WTGL basic circuit family can generate dual signal
outputs simultaneously and the overall timing variation will still be maintained
at the same low level.

Every wave-pipelined circuit must have substantially equal delay
(balanced) paths under nominal fabrication conditions. Usually tuning is
necessary to handle the unbalanced paths and various interconnections of
practical circuits. The overall tuning procedure has two steps:

(1) rough tuning, to insert additional delay elements to make all the
paths roughly in balance, and

(2) fine tuning, to deal with the specific driving requirements of
various signal connections, as well as to achieve minimization of
power requirements.
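
The rough-tuning step can be sketched as simple bookkeeping, under the assumption that every basic cell, including a padding delay element, contributes one equal stage delay. The function name, path names, and stage counts below are hypothetical.

```python
def rough_tune(path_stages):
    """Return how many padding delay elements each path needs so that
    every path has as many stages as the longest one.

    path_stages maps a path name to its number of logic stages. All
    stages are assumed to have equal delay.
    """
    longest = max(path_stages.values())
    return {name: longest - stages for name, stages in path_stages.items()}

paths = {"carry": 5, "sum": 3, "propagate": 1}  # hypothetical stage counts
padding = rough_tune(paths)
```

Fine tuning is not modeled here; it would adjust individual output invertors to absorb the residual fanout-dependent differences.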

For the WTGL circuits, each output signal has a driving invertor which
can be fine-tuned separately to balance the delay variations induced by the
different fanouts in a practical wave-pipelined circuit.

Recently, a Complementary Pass-transistor Logic (CPL) technique has
been used by others to implement a wave-pipelined 8x8 multiplier fabricated
in a normal CMOS process. Since the ideal maximum voltage swing at the
output of an NMOS pass block is only from 0 to (V_DD - V_T), the logic
threshold voltage of the output invertor must be set accordingly to achieve full
output logic swing. Therefore, during the fine-tuning stage, judicious sizing
of the entire cell (both the output invertors and the NMOS pass transistors)
was needed to adjust the driving ability of the basic circuit. In contrast, with
the WTGL basic cells of the invention, the output invertors can be treated as
single devices for fine tuning. A WTGL system has high regularity; all the
internal signal nodes have, at most, one transistor and one transmission gate
connected in series to V_DD or ground. Every stage has gate delays of the
same magnitude (approximately equivalent to T_d,TG + T_d,inv) and each output
signal has a separately-adjustable invertor. All of these characteristics are
beneficial for practical CAD (Computer-Aided Design) tools development and
logic synthesis.

Practical Circuit Design and Comparisons 

In order to evaluate and verify the WTGL approach of the invention,
several practical circuits have been designed. Since no CAD tools for CMOS
wave-pipelined circuit design have been reported, the rough tuning and fine
tuning were performed manually.

The results show that for the WTGL technique, since a family of basic
circuit cells having the same magnitude of gate delays and reduced delay
variations is available, higher speed and more compact practical
wave-pipelined circuits can be implemented than can be implemented with other
approaches which use only one basic cell (a NAND gate). In addition, the
actual circuit-design experience confirms that the high structural regularity and
dual-rail signal property of the WTGL technique are well suited for
wave-pipelined circuit design.

Parallel Carry-Look-Ahead Adder

One of the practical circuits designed was a 16-bit parallel adder.
During the adder design, a parallel architecture was adopted. The circuit was
modified to take advantage of the special dual-rail characteristics and flexible
logic functional choices of WTGL.

The traditional propagation bit p_i and generation bit g_i of the i-th bit
are:

(g_i, p_i) = (a_i b_i, a_i XOR b_i)   {EQ.2}
c_i = G_i   for i = 1, 2, ..., n

The G_i is defined by an associate operator o introduced as follows:

(G_i, P_i) = (g_i, p_i)   for i = 1, and
(G_i, P_i) = (g_i, p_i) o (G_{i-1}, P_{i-1})
           = (g_i + p_i G_{i-1}, p_i P_{i-1})   for i > 1   {EQ.3}

After the carry bit c_{i-1} is computed, the sum bit s_i is obtained by
s_i = p_i XOR c_{i-1}, and s_1 = p_1.

The logic-circuit block to implement the associate operation defined by
equation EQ. 3 is called a Black Processor. If only 2-input NAND gates are
available for wave-pipelined circuit design, both the associate operator o and
the XOR need two logic stages to be implemented. But if equation EQ. 3 is
analyzed while considering equation EQ. 2, one obtains:

(g_i, p_i) o (G_{i-1}, P_{i-1}) = (g_i + p_i G_{i-1}, p_i P_{i-1})   for i > 1.

The logic function of G_i becomes a single 2-to-1 MUX. Such a logic
function is available in the WTGL basic-cell library (e.g., in Figure 3d) and it
has the same gate delays as the XOR and AND/OR gates. Therefore, the
Black Processor has exactly one logic stage for both G_i and P_i, and thus the
total number of logic stages is significantly reduced. The new wave-pipelined
adder architecture is shown in Figure 6. The logic functions of pg generators
are described in Equation EQ. 2. The Black Processors produce G_i and P_i,
as indicated in Equation EQ. 3. In addition, the initial p_i's also need to be
latched to the sum stage, along with the carries (the delay latches for the
p_i's are not explicitly shown in Figure 6). Since different driving ability is
required for some cells, the delay properties of the WTGL cells with different
fanouts were simulated, and then the output invertor strings were fine tuned to
balance the delay variations among the inputs of every next-stage cell. Since
the adder has a regular architecture, the fine tuning can be easily handled.
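
One way to see why G_i reduces to a multiplexor: by EQ. 2, g_i and p_i are never both 1, so g_i + p_i G_{i-1} selects G_{i-1} when p_i is 1 and g_i when p_i is 0. The sketch below checks this on a 16-bit addition. It is an illustration under stated assumptions: bits are indexed from 0 rather than 1, the carries are computed by a serial scan rather than the parallel arrangement of Figure 6, and all function names are hypothetical.

```python
def pg(a, b):
    """EQ. 2: generation and propagation bits (g, p) for one bit position."""
    return a & b, a ^ b

def mux(sel, when_1, when_0):
    """2-to-1 multiplexor."""
    return when_1 if sel else when_0

def black(gp, GP_prev):
    """EQ. 3 with the G term written as a single MUX selected by p."""
    g, p = gp
    G_prev, P_prev = GP_prev
    return mux(p, G_prev, g), p & P_prev

def add(x, y, n=16):
    """Add two n-bit unsigned integers using the (G, P) recurrence."""
    bits = [pg((x >> i) & 1, (y >> i) & 1) for i in range(n)]
    G, P = bits[0]
    carries = [G]                       # carries[i] is the carry out of bit i
    for gp in bits[1:]:
        G, P = black(gp, (G, P))
        carries.append(G)
    total = bits[0][1]                  # lowest sum bit is just p
    for i in range(1, n):
        total |= (bits[i][1] ^ carries[i - 1]) << i
    return total | (carries[-1] << n)   # final carry becomes bit n
```

For example, `add(12345, 54321)` returns 66666, and `add(0xFFFF, 1)` returns 0x10000, with the carry rippling through every MUX.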

Some typical wave-pipelined addition operation sequences were
simulated by RSIM. It was found that the average delay variation for SUM (the
addition result vector) is about 0.9 ns (with SCMOS technology parameters).
The delay variations are mainly due to the lack of more accurate and effective
CAD tools for fine-tuning and to the intrinsic slight delay variations of the
basic circuit cells. The new data waves are latched every 3 ns, so the
data-processing speed is about 333 million operations per second.

It might seem that, for a dual-rail wave-pipelined circuit design technique, the
chip area and number of transistors would increase compared to other
single-rail techniques. But the actual layout of the WTGL 16-bit adder is very
compact and the chip size and transistor counts observed using WTGL
methods are substantially smaller than those achieved using other methods.
(This is because in the methods used by those other researchers, there is only
one basic logic circuit, a NAND gate, available.)

Another practical circuit is a macro-cell, the 4:2 compressor, which is
the basic building block of a multiplier. Figure 7 is a schematic circuit
diagram of an embodiment of a dual-rail WTGL 4:2 compressor design
according to the invention. This WTGL approach takes about 120 transistors,
but has a delay of only 3T_d,TG + 3T_d,inv. In Figure 7, OR/NOR gates 71A,
71B, and 71C each comprise a pair of gate circuits: one of the type shown in
Figure 2b wired (as described above) to provide one of the rail polarities, and
one of the type shown in Figure 2d wired to provide the other rail polarity.
AND/NAND gates 72A, 72B, and 72C also each comprise a pair of gate
circuits: one of the type shown in Figure 2b to provide one of the rail
polarities, and one of the type shown in Figure 2d to provide the other rail
polarity. XOR/XNOR gates 73A, 73B, 73C, and 73D each comprise a pair
of gate circuits: one of the type shown in Figure 3a wired to provide one of
the rail polarities as in Figure 3b, and another also of the type shown in
Figure 3a, but wired to provide the other rail polarity as in Figure 3c.
MUX/inverse-MUX gate 74 comprises a pair of gate circuits: one of the type
shown in Figure 3a wired to provide one of the rail polarities as in Figure 3d,
and another also of the type shown in Figure 3a, but wired to provide the
other rail polarity as in Figure 3e.

Figure 7 shows a complementary field-effect transistor 4:2 compressor
logic circuit comprising: OR/NOR gate 71A coupled to input signal X1 and
input signal X2 and producing OR/NOR signal 81A; OR/NOR gate 71B
coupled to input signal X3 and input signal X4 and producing OR/NOR signal
81B; AND/NAND gate 72A coupled to OR/NOR signal 81A and OR/NOR
signal 81B and producing a C-OUT signal; AND/NAND gate 72B coupled to
input signal X1 and input signal X4 and producing internal AND/NAND signal
82B; AND/NAND gate 72C coupled to input signal X2 and input signal X3 and
producing internal AND/NAND signal 82C; OR/NOR gate 71C coupled to
internal AND/NAND signal 82B and internal AND/NAND signal 82C and
producing OR/NOR signal 81C; XOR/XNOR gate 73A coupled to input signal
X1 and input signal X2 and producing internal XOR/XNOR signal 83A;
XOR/XNOR gate 73B coupled to input signal X3 and input signal X4 and
producing internal XOR/XNOR signal 83B; XOR/XNOR gate 73C coupled to
internal XOR/XNOR signal 83A and internal XOR/XNOR signal 83B and
producing internal XOR/XNOR signal 83C; XOR/XNOR gate 73D coupled to
internal XOR/XNOR signal 83C and a carry-in signal and producing an S
output signal; and MUX/inverse-MUX gate 74 coupled to the carry-in signal
and internal OR/NOR signal 81C and selected by internal XOR/XNOR signal
83C and producing a C output signal.

The WTGL design of Figure 7 has dual-rail inputs and generates dual-rail
outputs. A SPICE circuit simulation (using Hewlett Packard 1.0 µm
CMOS26 technology parameters) shows that the typical WTGL total delay is
only 0.52 ns for the circuit of Figure 7 (which is equivalent to the sum of 3
invertor delays and 3 transmission gate delays), with the total delay variation
less than 15%.

Comparisons to other wave-pipeline techniques show that the WTGL
approach provides greater speed, as well as a flexible design. In addition,
since the WTGL basic cell family is dual-rail and all the logic cells have the
same gate delays, the basic cells were used and the routing was performed
without inserting any padding elements.

WTGL cell library

The basic cell (SCMOS) library has been implemented, and several
experimental circuits have been designed. The research results have revealed
that the WTGL circuit family of the invention is suitable for CMOS
wave-pipelining design and has certain advantages over other approaches.
Currently, several practical circuits are under design and will be fabricated by
MOSIS.

Since the possible clocking speed of a wave-pipelined circuit can be as
high as several hundred MHz and the frequency value is crucial for the
circuit's functionality, on-chip clock-signal generation and manipulation is
necessary, and related function modules such as high-frequency flip-flops and
shift registers are also needed for wave-pipelined circuits. Currently,
single-phase-clocked double-edge-triggered DFF's are being used as the I/O latches
of the wave-pipelined functional block, to fully use the clock phases (i.e., both
the rising edges and falling edges of the clock are used to clock the latches)
and to partially alleviate the specific high-speed-data requirement (the clock
speed is half of the data processing speed using this technique).
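The factor of two can be seen with a small behavioral model. The Python sketch below is an illustration only (the function names and sample values are hypothetical, and it models ideal sampling rather than any actual DFF circuit): a latch that captures on every clock transition stores two data items per clock cycle, so the clock needs to run at only half the data rate.

```python
def required_clock_mhz(data_rate_mops: float, edges_per_cycle: int = 2) -> float:
    """Clock frequency needed when data is captured on `edges_per_cycle`
    clock edges per cycle (2 for a double-edge-triggered latch)."""
    return data_rate_mops / edges_per_cycle


def double_edge_capture(clock, data):
    """Return the data values captured at every clock transition
    (rising or falling) in a sampled clock waveform."""
    captured = []
    previous_level = clock[0]
    for level, value in zip(clock[1:], data[1:]):
        if level != previous_level:  # any edge triggers a capture
            captured.append(value)
        previous_level = level
    return captured


# Two full clock cycles (0-1-0-1-0) capture four data items.
clock_samples = [0, 1, 0, 1, 0]
data_samples = [None, "d0", "d1", "d2", "d3"]
print(double_edge_capture(clock_samples, data_samples))
print(required_clock_mhz(300.0), "MHz clock for 300 M data items per second")
```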

Logic Synthesis and Tuning Algorithms

Logic synthesis has been studied and explored over the past 30 years.
Traditionally, efficient methods for implementing combinational logic in
optimal two-level form using PLAs (Programmable-Logic Arrays) have been
popular. Multilevel logic synthesis has been used in several systems such as
the Logic Synthesis System (LSS) and the MIS system of the University of
California, Berkeley. A widely-accepted optimization criterion is to minimize the
physical area while simultaneously satisfying the timing constraints (typically
the block maximum and/or minimum delay parameters) derived from a
system-level analysis of the chip.

Considering the specific characteristics of the wave-pipelining
technique, the conventional area and timing optimization goals are no longer
appropriate for most cases. The ideal logic-synthesis algorithms for
wave-pipelined systems should implement a certain function with a very high degree
of delay balance. A higher degree of balance results in smaller total latency
variation between different input-output paths, measured at basic gate-delay
resolution. In practice, for wave-pipelining design, rough-tuning algorithms are
needed to modify the circuit to have the highest degree of balance by inserting
delay elements, as described in references [Wong:89] and [Wong:93].
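To make the idea of balancing by inserting delay elements concrete, the following Python sketch pads a small combinational netlist so that all inputs of every gate arrive at the same time. It is a simplified illustration under stated assumptions, not the algorithm of [Wong:89] or [Wong:93]: every gate is assumed to have exactly one unit of delay, wiring delay is ignored, and the netlist format and names are hypothetical.

```python
def balance_by_padding(gates, primary_inputs):
    """Compute unit-delay padding for an acyclic netlist.

    gates: dict mapping gate name -> list of fan-in names.
    primary_inputs: names that arrive at time 0.
    Returns (level, padding): level[n] is the arrival time of node n in
    unit gate delays; padding[(src, dst)] is the number of unit delay
    elements to insert on that connection.
    """
    level = {name: 0 for name in primary_inputs}

    def arrival(node):
        # A gate's output is ready one unit after its latest input.
        if node not in level:
            level[node] = 1 + max(arrival(f) for f in gates[node])
        return level[node]

    for gate in gates:
        arrival(gate)

    padding = {}
    for gate, fanins in gates.items():
        for fanin in fanins:
            slack = level[gate] - 1 - level[fanin]
            if slack > 0:  # this input arrives early, so delay it
                padding[(fanin, gate)] = slack
    return level, padding


# Hypothetical three-gate example with unbalanced paths.
example_gates = {"g1": ["a", "b"], "g2": ["g1", "c"], "out": ["g2", "a"]}
example_level, example_padding = balance_by_padding(example_gates, ["a", "b", "c"])
print(example_level)
print(example_padding)
```

In this example, input c reaches g2 one gate delay early and input a reaches the output gate two gate delays early, so one and two delay elements are inserted on those connections. Equalizing the depth of multiple primary outputs is omitted from the sketch.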

A tuning algorithm has been developed to go along with an ECL/CML
wave-pipelined circuit design [Wong:93]. Notice that the rough-tuning
algorithm assumes that an arbitrary combinational block is already available
for modification, without considering the specific synthesis strategies required
for wave-pipelining. It is believed necessary to integrate the logic synthesis
and rough-tuning algorithms, since they are closely related in terms of the
general optimization goals such as area and timing constraints. In addition,
because the WTGL technique has several available basic logic-circuit cells,
each with the same gate-delay timing, the logic-synthesis and rough-tuning
algorithms using WTGL as target technology would be more efficient. Also,
the availability of various basic cells with substantially equal delays
necessitates a fresh look at the Boolean minimization problem to effectively
utilize the logic blocks which are available using WTGL.

The Binary Decision Diagrams (BDDs) method has been widely used
for logic verification and manipulation as described in references [Akers:78]
and [Bryant:86]. A more general form, called If-Then-Else Directed Acyclic
Graphs (DAGs), has been successfully used for multi-level logic minimization
as described in references [Karplus:89] and [Karplus:91], with FPGAs as
target technology.

All the basic circuit cells of WTGL can be defined by exactly one
if-then-else operator:

ab = (if a then b else FALSE)

a+b = (if a then TRUE else b)

a XOR b = (if a then (NOT b) else b)

ac + b(NOT c) = (if c then a else b)
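The four if-then-else definitions can be checked exhaustively. The Python sketch below is an illustration only (the function names are chosen here and are not from the original work); it takes the XOR "then" branch to be the complement of b and the select function to be ac + b(NOT c), which is what the if-then-else forms imply.

```python
from itertools import product


def ite(cond: bool, then_val: bool, else_val: bool) -> bool:
    """The single if-then-else operator."""
    return then_val if cond else else_val


def and_gate(a, b):
    return ite(a, b, False)


def or_gate(a, b):
    return ite(a, True, b)


def xor_gate(a, b):
    return ite(a, not b, b)


def select_gate(c, a, b):
    return ite(c, a, b)


# Compare each definition against its Boolean expression for all inputs.
for a, b in product([False, True], repeat=2):
    assert and_gate(a, b) == (a and b)
    assert or_gate(a, b) == (a or b)
    assert xor_gate(a, b) == (a != b)

for a, b, c in product([False, True], repeat=3):
    assert select_gate(c, a, b) == ((a and c) or (b and not c))

print("all if-then-else definitions match their truth tables")
```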

In this case, the logic function, as well as the node-timing information
measured by basic gate delays, can be represented by if-then-else DAGs.
Meanwhile, the rough-tuning algorithms can also use the Directed Acyclic
Graph (DAG) representation of circuits, as described in reference [Wong:93].
Therefore, from this common starting point, it would be highly feasible to
integrate the logic synthesis and rough tuning to generate more efficient and
powerful CAD tools for wave-pipelined circuit implementations with the
WTGL basic cell library as target technology.

For the fine-tuning algorithm, the detailed timing analysis of CMOS
circuits will be focused upon, especially Transmission-Gate-based circuits. As
the equivalent circuits of WTGL basic cells in Figures 5a through 5j indicate,
the gate delay can be determined by analysis of only a few charge and
discharge equivalent circuits. Therefore, simple but effective delay models
(which include the fanout information) play a significant role in the
development of actual fine-tuning CAD algorithms. In addition, post-layout
circuit extraction/simulation procedures should also be incorporated with the
fine-tuning CAD tools.
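As one example of what a simple fanout-aware delay model might look like, the Python sketch below uses a first-order lumped RC approximation, in which the 50% delay of a single-pole charge or discharge path is ln(2) times R times C. This is a generic textbook model, not the delay model of the invention, and all component values are hypothetical.

```python
import math


def rc_gate_delay(r_on_ohm: float, c_internal_f: float,
                  c_input_f: float, fanout: int) -> float:
    """50% delay (seconds) of a single-pole RC path: the driving
    resistance charges the gate's own capacitance plus one input
    capacitance per fanout."""
    c_total = c_internal_f + fanout * c_input_f
    return math.log(2) * r_on_ohm * c_total


# Hypothetical values chosen only to show the trend with fanout.
R_ON = 5e3       # ohms, equivalent on-resistance of the driving path
C_INT = 10e-15   # farads, internal node capacitance
C_IN = 5e-15     # farads, input capacitance of each driven cell

for fanout in (1, 2, 4):
    delay_ps = rc_gate_delay(R_ON, C_INT, C_IN, fanout) * 1e12
    print(f"fanout {fanout}: {delay_ps:.1f} ps")
```

Under this model the delay grows linearly with fanout, which is why an output driver feeding more cells would need a lower equivalent resistance to keep its delay equal to that of the other cells.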

Wave-pipelining using the WTGL circuit family shows enormous
potential for high-speed digital-system design. As the research described
above has shown, the WTGL family, by providing substantially equal rise and
fall times, and reduced input-pattern-caused gate-delay variations, overcomes
strict timing constraints often required by other wave-pipelining methods,
without sacrificing the basic advantages of the wave-pipelined systems. The
analysis on the basic cells of the invention has shown a significant
performance improvement over wave-pipelined systems formed by previously
described methods.

The WTGL family of basic cells having substantially equal delays
allows the development of compact, high-speed circuits through logic
synthesis. For instance, the WTGL technique has been shown to be a
promising method for high-performance CFET wave-pipelined circuit design.
It can be used in applications such as high-speed arithmetic units for
high-performance computing systems and high-throughput digital-signal processors
for pattern-recognition and image-processing systems. This will surely help in
the development of high-speed systems of the future.

It is to be understood that the above description is intended to be
illustrative, and not restrictive. Many other embodiments will be apparent to
those of skill in the art upon reviewing the above description. For instance,
an enhancement-mode Insulated-Gate Field-Effect Transistor (IGFET)
technology is used in some of the embodiments (e.g., Figures 2b, 2d, and 3)
described above, but a person skilled in the art could use an analogous
method using, e.g., depletion-mode devices, or MESFETs. The scope of the
invention should, therefore, be determined with reference to the appended
claims, along with the full scope of equivalents to which such claims are
entitled.

WHAT IS CLAIMED IS:



1. A logic circuit for performing a logic function, said logic circuit
comprising:

a first field-effect transistor having a first terminal coupled to a first
input signal, a second terminal coupled to an internal node, and a gate coupled
to a second input signal;

a second field-effect transistor having a first terminal coupled to said
first input signal, a second terminal coupled to said internal node, and a gate
coupled to a logical complement of said second input signal;

a third field-effect transistor having a first terminal coupled to a
voltage source, a second terminal coupled to said internal node, and a gate
coupled to said second input signal;

a driver coupled to said internal node, said driver comprising means
for amplifying a voltage, adjusting logic levels, and providing an output
signal; and

a plurality of paths to said output from said first input signal, said
second input signal, and said logical complement of said second input signal,
respectively, each of said plurality of paths having substantially equal delays.

2. A logic circuit according to claim 1 wherein: 

said first field-effect transistor has a first channel type; 
said second field-effect transistor has a second channel type which is 
complementary to said first channel type; and 
said third field-effect transistor has said second channel type.



3. A logic circuit according to claim 2 wherein: 

said first channel type is an N-channel enhancement-mode channel; 
said second channel type is a P-channel enhancement-mode channel; and

said voltage source is

4. A logic circuit according to claim 3 wherein:
said logic function is a logical AND. 

5. A logic circuit according to claim 3 wherein: 
said logic function is a logical NOR.



6. A logic circuit according to claim 2 wherein: 

said first channel type is a P-channel enhancement-mode channel; 
said second channel type is an N-channel enhancement-mode channel; and

said voltage source is ground.



7. A logic circuit according to claim 6 wherein: 
said logic function is a logical NAND. 


8. A logic circuit according to claim 6 wherein: 
said logic function is a logical OR. 



9. A logic circuit comprising: 
a transmission gate having a first terminal coupled to a first input
signal, a second terminal coupled to a node, and a gate coupled to a second
input signal;

a pull transistor having a first terminal coupled to a voltage source, a
second terminal coupled to said node, and a gate coupled to a third input
signal; and

a driver coupled to said node, said driver comprising means for 
amplifying a voltage, adjusting logic levels, and providing an output signal. 



10. A logic circuit according to claim 9 wherein said transmission gate
comprises complementary field-effect transistors.

11. A logic circuit according to claim 10 wherein said complementary
field-effect transistors comprise: 

at least one N-channel enhancement-mode transistor; and 
at least one P-channel enhancement-mode transistor. 


12. A complementary field-effect transistor logic circuit comprising: 

a first pass transistor having a first channel type and having a first
terminal coupled to a first input signal, a second terminal coupled to an
internal node, and a gate coupled to a second input signal;

a second pass transistor having a second channel type which is
complementary to said first channel type and having a first terminal coupled
to said first input signal, a second terminal coupled to said internal node, and
a gate coupled to a logical complement of said second input signal;

a third transistor having said second channel type and having a first
terminal coupled to a voltage terminal, a second terminal coupled to said
internal node, and a gate coupled to said second input signal; and

a driver coupled to said internal node, comprising means for
amplifying a voltage and adjusting logic levels at an output signal.



13. A family of logic circuits for performing logic functions, said family
comprising:

a first logic circuit comprising:

a first field-effect transistor having a first terminal
coupled to a first input signal, a second terminal coupled to a
first internal node, and a gate coupled to a second input signal;

a second field-effect transistor having a first terminal
coupled to said first input signal, a second terminal coupled to
said first internal node, and a gate coupled to a logical
complement of said second input signal;

a third field-effect transistor having a first terminal
coupled to a voltage source, a second terminal coupled to said
first internal node, and a gate coupled to said second input
signal; and

a first driver coupled to said first internal node, said first
driver comprising means for amplifying a voltage, adjusting
logic levels, and providing a first output signal;

a second logic circuit comprising:

a first and a second transmission gate, wherein said first
and second transmission gates have substantially equivalent
output transition times and substantially equivalent propagation
delay times, and wherein each of said first and second
transmission gates comprise:

a fourth field-effect transistor having a first
terminal coupled to a third input signal, a second
terminal coupled to a second internal node, and a gate
coupled to a fourth input signal; and

a fifth field-effect transistor having a first
terminal coupled to said third input signal, a second
terminal coupled to said second internal node, and a gate
coupled to a logical complement of said fourth input
signal; and

a second driver coupled to said internal nodes of each of
said first and second transmission gates, wherein said driver
comprises means for amplifying a voltage, adjusting logic
levels, and providing a second output signal; and

wherein said first and second logic circuits have substantially equivalent
output transition times and substantially equivalent propagation delay times.

14. A logic circuit for performing a wave-pipelined logic function, said
logic circuit comprising:

a first, second, third, and fourth logic circuit, each of said logic circuits
comprising:

a first field-effect transistor (FET) having a first terminal
coupled to a first input signal, a second terminal coupled to a
first internal node, and a gate coupled to a second input signal;

a second FET having a first terminal coupled to said
first input signal, a second terminal coupled to said first internal
node, and a gate coupled to a logical complement of said
second input signal;

a third FET having a first terminal coupled to a voltage
source, a second terminal coupled to said first internal node,
and a gate coupled to said second input signal; and

a driver coupled to said first internal node, said driver
comprising means for amplifying a voltage, adjusting logic
levels, and providing a first output signal;

said first and second logic circuits forming a first stage and said third and
fourth logic circuits forming a second stage, said output of said first logic
circuit being coupled as said second input signal for said third logic circuit,
said output of said second logic circuit being coupled as said logical
complement of second input signal for said third logic circuit, said first FET
of said first and third logic circuits and said second and third FETs of said
second and fourth logic circuits each having a first channel type, said first
FET of said second and fourth logic circuits and said second and third FETs
of said first and third logic circuits each having a second channel type, and
said first, second, third, and fourth logic circuits having substantially
equivalent output transition times and substantially equivalent propagation
delay times.



15. A logic circuit for performing a logic function which generates a
result, the result having both a first output signal and a second output signal,
the second output signal being a logical complement of the first output signal,
both output signals having substantially equivalent output transition times and
substantially equivalent propagation delay times relative to any input signal
transition, the circuit comprising:

a first field-effect transistor having a first terminal coupled to a first
input signal, a second terminal coupled to a first internal node, and a gate
coupled to a second input signal;

a second field-effect transistor having a first terminal coupled to the
first input signal, a second terminal coupled to the first internal node, and a
gate coupled to a logical complement of the second input signal;

a third field-effect transistor having a first terminal coupled to a first
voltage source, a second terminal coupled to the first internal node, and a gate
coupled to the logical complement of the second input signal;

a first driver coupled to the first internal node, the first driver
comprising means for amplifying a voltage, adjusting logic levels, and
providing the first output signal;

a fourth field-effect transistor having a first terminal coupled to a
logical complement of the first input signal, a second terminal coupled to a
second internal node, and a gate coupled to the second input signal;

a fifth field-effect transistor having a first terminal coupled to the
logical complement of the first input signal, a second terminal coupled to the
second internal node, and a gate coupled to the logical complement of the
second input signal;

a sixth field-effect transistor having a first terminal coupled to a
second voltage source, a second terminal coupled to the second internal node,
and a gate coupled to the second input signal; and

a second driver coupled to the second internal node, the second driver
comprising means for amplifying a voltage, adjusting logic levels, and
providing the second output signal.